Monaural speech segregation using synthetic speech signals.
Abstract
When listening to natural speech, listeners are fairly adept at using cues such as pitch, vocal tract length, prosody, and level differences to extract a target speech signal from an interfering speech masker. However, little is known about the cues that listeners might use to segregate synthetic speech signals that retain the intelligibility characteristics of speech but lack many of the features that listeners normally use to segregate competing talkers. In this experiment, intelligibility was measured in a diotic listening task that required the segregation of two simultaneously presented synthetic sentences. Three types of synthetic signals were created: (1) sine-wave speech (SWS); (2) modulated noise-band speech (MNB); and (3) modulated sine-band speech (MSB). The listeners performed worse for all three types of synthetic signals than they did with natural speech signals, particularly at low signal-to-noise ratio (SNR) values. Of the three synthetic signals, the results indicate that SWS signals preserve more of the voice characteristics used for speech segregation than MNB and MSB signals. These findings have implications for cochlear implant users, who rely on signals very similar to MNB speech and thus are likely to have difficulty understanding speech in cocktail-party listening environments.
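The abstract reports intelligibility as a function of signal-to-noise ratio (SNR) for two simultaneously presented sentences, but does not say how the mixtures were built. The following is a minimal sketch of one common way to combine a target and a masker at a chosen SNR, assuming the SNR is defined as the RMS level ratio of target to masker in dB. The function names, the sample rate, and the pure tones standing in for sentences are all hypothetical and are not taken from the study.

```python
import math


def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mix_at_snr(target, masker, snr_db):
    """Scale the masker so the target-to-masker RMS ratio equals snr_db,
    then add it to the target. Returns (mixture, scaled_masker).

    Assumption: SNR is an overall RMS ratio; the study may have used a
    different level definition.
    """
    if len(target) != len(masker):
        raise ValueError("target and masker must have the same length")
    masker_rms = rms(masker)
    if masker_rms == 0:
        raise ValueError("masker is silent; SNR is undefined")
    gain = rms(target) / (masker_rms * 10 ** (snr_db / 20))
    scaled = [gain * m for m in masker]
    mixture = [t + m for t, m in zip(target, scaled)]
    return mixture, scaled


# Hypothetical stand-ins for two sentences: one second of two tones.
fs = 16000
target = [math.sin(2 * math.pi * 220 * n / fs) for n in range(fs)]
masker = [0.3 * math.sin(2 * math.pi * 330 * n / fs) for n in range(fs)]

mixture, scaled = mix_at_snr(target, masker, snr_db=-6.0)

# Check the SNR actually achieved after scaling.
achieved_snr_db = 20 * math.log10(rms(target) / rms(scaled))
```

For a diotic presentation such as the one described, the same `mixture` would be sent to both ears. Scaling only the masker keeps the target level fixed across SNR conditions, which is one possible design choice among several.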
Related articles
Pitch-based monaural segregation of reverberant speech.
In everyday listening, both background noise and reverberation degrade the speech signal. Psychoacoustic evidence suggests that human speech perception under reverberant conditions relies mostly on monaural processing. While speech segregation based on periodicity has achieved considerable progress in handling additive noise, little research in monaural segregation has been devoted to reverbera...
On Amplitude Modulation for Monaural Speech Segregation
We propose a computational auditory scene analysis (CASA) model for monaural speech segregation. It deals with low-frequency and high-frequency signals differently. For high-frequency signals, it generates segments based on common amplitude modulation (AM) and groups them according to AM repetition rates. This model performs substantially better than previous CASA systems.
Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions
Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...
Monaural speech/music source separation using discrete energy separation algorithm
In this paper, we address the problem of monaural source separation of a mixed signal containing speech and music components. We use Discrete Energy Separation Algorithm (DESA) to estimate frequency-modulating (FM) signal energy. The FM signal energy is used to design a time-varying filter in the time–frequency domain for rejecting the interfering signal. The FM signal energy was chosen due to ...
Monaural speech segregation based on pitch track correction using an ensemble Kalman filter
We propose a novel method of pitch track correction that uses an ensemble Kalman filter to improve the performance of monaural speech segregation. The proposed method considers all reliable pitch streaks for pitch track correction, whereas the conventional segregation approach relies on only the longest streak in a given speech stream. In addition, unreliable pitch streaks are corrected with an...
Journal:
- The Journal of the Acoustical Society of America
Volume 119, Issue 4
Pages: -
Publication date: 2006